Speech Intelligibility Prediction Based on the Envelope Power Spectrum Model with the Dynamic Compressive Gammachirp Auditory Filterbank

Authors

Katsuhiko Yamamoto

Toshio Irino

Toshie Matsui

Shoko Araki

Keisuke Kinoshita

Tomohiro Nakatani

Abstract

In this study, we develop a new method for predicting the speech intelligibility of synthetic sounds processed by nonlinear speech enhancement algorithms. The speech-based envelope power spectrum model (sEPSM) was proposed to account for subjective results on spectral subtraction, but it has not been tested on recent state-of-the-art speech enhancement algorithms. We introduce a dynamic compressive gammachirp auditory filterbank as the front end of the sEPSM (dcGC-sEPSM) to improve its predictions. We perform subjective experiments on the speech intelligibility (SI) of noise-reduced sounds processed by spectral subtraction and by a recently developed Wiener filter algorithm. We compare the subjective SI scores with the objective SI scores predicted by the proposed dcGC-sEPSM, the original GT-sEPSM, the three-level coherence SII (CSII), and the short-time objective intelligibility (STOI) measure. The results show that the proposed dcGC-sEPSM performs better than the conventional models.
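For readers unfamiliar with the sEPSM family, the decision metric is usually described as a signal-to-noise ratio in the envelope domain: the envelope power attributed to speech divided by the envelope power of the noise. The sketch below illustrates only that ratio for a single pre-computed envelope. It is not the authors' dcGC-sEPSM: it assumes the envelopes have already been extracted from one auditory channel, it omits the gammachirp filterbank and the modulation filterbank entirely, and the function names and the `floor` value are hypothetical choices made for this illustration.

```python
import numpy as np

def envelope_power(envelope):
    """AC power of an envelope, normalized by its squared mean (DC) value.

    Assumes the envelope has a nonzero mean.
    """
    envelope = np.asarray(envelope, dtype=float)
    dc = envelope.mean()
    ac = envelope - dc
    return np.mean(ac ** 2) / dc ** 2

def snr_env(env_mixture, env_noise, floor=1e-4):
    """Envelope-domain SNR: (P_mixture - P_noise) / P_noise.

    `floor` (a hypothetical value) keeps both terms positive so the
    ratio stays finite when the noise dominates the mixture.
    """
    p_mix = envelope_power(env_mixture)
    p_noise = max(envelope_power(env_noise), floor)
    p_speech = max(p_mix - p_noise, floor)
    return p_speech / p_noise
```

In a full model this ratio would be computed per auditory channel and per modulation band, combined across bands, and then mapped to a predicted intelligibility score; those stages are outside the scope of this sketch.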

Similar resources

Predicting Speech Intelligibility Using a Gammachirp Envelope Distortion Index Based on the Signal-to-Distortion Ratio

A new intelligibility prediction measure, called the “Gammachirp Envelope Distortion Index (GEDI),” is proposed for the evaluation of speech enhancement algorithms. This model calculates the signal-to-distortion ratio in envelope responses (SDRenv) derived from the gammachirp filterbank outputs of clean and enhanced speech, and is an extension of the speech-based envelope power spectrum model (s...

Event detection of speech signals based on auditory processing with a dynamic compressive gammachirp filterbank

To simulate the perceptual extraction of temporal structures of speech, the authors have been proposing an event-plausibility model that detects the occurrence of subevents in continuous speech signals based on auditory processing. One of its core components is the filterbank module that simulates the mechanical frequency analysis of the basilar membrane in the cochlea. In this paper, output b...

Robust feature extraction based on an asymmetric level-dependent auditory filterbank and a subband spectrum enhancement technique

In this paper, we introduce a robust feature extractor, dubbed robust compressive gammachirp filterbank cepstral coefficients (RCGCC), based on an asymmetric and level-dependent compressive gammachirp filterbank and a sigmoid-shaped weighting rule for the enhancement of speech spectra in the auditory domain. The goal of this work is to improve the robustness of speech recognition systems in ad...

An Auditory Model of Speaker Size Perception for Voiced Speech Sounds

An auditory model was developed to explain the results of behavioral experiments on perception of speaker size with voiced speech sounds. It is based on the dynamic, compressive gammachirp (dcGC) filterbank and a weighting function (SSI weight) derived from a theory of size-shape segregation in the auditory system. Voiced words with and without high-frequency emphasis (+6 dB/octave) were produce...

Development of the MTF-based speech dereverberation method using adaptive time-frequency division

We previously proposed a speech dereverberation method based on the modulation transfer function (MTF) concept. In that model, a reverberant speech signal is decomposed into power envelopes and carriers using a constant-bandwidth filterbank, and these are then restored in each channel using power envelope inverse filtering and a carrier regeneration method, respectively. In this paper, we...

Publication date: 2016
